Counter-Factual Meaningfulness and the Bell and CHSH Inequalities 
We discuss the role of counter-factual meaningfulness (a weaker cousin of "counter-factual
definiteness") as a premise in the derivation of the Bell and CHSH inequalities. The basic question
motivating the discussion is this: can the CHSH inequality, unlike the original Bell inequality, be 
derived without making a hidden-variables (or equivalent counter-factual definiteness) assumption? 
We answer, somewhat tentatively, in the negative, and suggest that an appropriately-modified ver- 
sion of the EPR argument is needed to rigorously establish that the empirical violation of Bell-type 
inequalities can only be blamed on the failure, in nature, of local causality. 



I. INTRODUCTION 

In a recent paper I characterized interpreters of 
Bell's Theorem as falling into two classes: (1) those 
who think that Bell's result (and the associated experi- 
mental data) prove that non-locality is a necessary fea- 
ture of any empirically viable theory and hence a feature 
of nature itself, and (2) those who think that the results 
prove merely that non-locality is a necessary feature of 
any empirically viable theory of hidden variables (HV - 
or any theory containing a counter-factual definiteness 
(CFD) property, or some other similar anti-orthodox or 
anti-Copenhagen character). [1] In that earlier paper
I took it for granted that Bell's derivation of the (original,
Bell) inequality did use a HV/CFD assumption, but
argued that a modified (rigorous) version of the EPR
argument [?] establishes that the necessary HV/CFD
principle follows from locality - and hence should not be
thought of as a separate axiom behind Bell's inequality, 
restricting the class of theories to which the inequality is 
applicable (and hence likewise restricting the class of the- 
ories which the theorem shows must be non-local in order 
to agree with experiment). In short, I argued that local- 
ity alone gives rise to the inequality (but in a two-step 
dance consisting of the modified EPR argument followed 
by the derivation of the Bell inequality), thus demon- 
strating the correctness of the interpreters in "class (1)" 
from above. 

An anonymous referee for that earlier paper, however, 
questioned whether this sophisticated two-step argument 
was really needed, since the CHSH [?] inequality (unlike
the original Bell inequality, he suggested) could be de- 
rived without assuming HV/CFD. That is, the referee 
claimed that the CHSH inequality follows (in one step, 
so to speak) from locality alone - and hence its empirical 
violation demonstrates already that nonlocality is a fact 
of nature, without further analysis or discussion. [?]

I was initially skeptical of this claim, operating under 
the assumption that the difference between the various 
(generalized-) Bell inequalities is (roughly) ease of empirical
testing, and not anything about their logical "inputs".
After a rather (and, in retrospect, unfairly and
unfortunately) dismissive response to the referee, a voice
in the back of my head urged me to think about this
matter further, which I proceeded to do.

The first puzzle raised by this further thinking was 
this: is it even true that the (original) Bell inequality has 
some HV/CFD assumption built into it? This, I think, 
is widely accepted as a truism and there's no doubt that 
many of the later attempts to popularize, explain, and 
derive a Bell inequality do make a HV/CFD assumption 
explicitly. (Mermin's various derivations using the idea of 
"instruction sets" are probably the clearest examples 
here.) But where, exactly, does this assumption appear 
in Bell's original derivation? The answer turned out to 
be more subtle than I expected. But finding, eventually, 
that the answer is (probably) "yes" put me on the path 
toward eventually identifying a similar (though not iden- 
tical, and the difference is subtle and interesting) sort of 
HV/CFD assumption that appears also in the derivation 
of the CHSH inequality. 

So I am left, in the end, (basically) agreeing with my 
initial sense, though also embarrassed at how superficial 
my prior basis for this conclusion had been. I am also left 
with a much deeper appreciation for why there has been 
such long and lingering disagreement between advocates 
of (1) and (2) from the first paragraph above. For the 
sense in which HV/CFD appears in the CHSH derivation 
is subtle and not obvious, and it is indeed still not crystal 
clear to me that this assumption is even really there! So 
the final conclusion in my mind is this: it's good that 
the modified EPR argument (from locality to HV/CFD)
exists to unequivocally settle the dispute in favor of (1). 

But the real point of the current essay is the interesting 
journey, not so much the conclusion. So let us jump in 
with that. 



II. BELL 

Let's get notation out of the way first. Throughout, 
we will consider the standard EPR-Bell setup, in which 
a central source emits oppositely-directed particles (elec- 
trons, say) in the spin-entangled singlet (total spin zero) 
state. (That, at least, is how orthodox QM describes 
the relevant state; other candidate theories might give a 
more or less or differently detailed account of the particle
pair's state.) Note in particular that the state is not
a spin eigenstate for either of the individual particles; 
i.e., orthodox QM (OQM) attributes no definite spin- 
component values to either particle individually prior to 
measurement. 

After the particles fly apart to some large distance (so 
that subsequent spin-component measurements, which 
take some finite time, can nevertheless be made with 
spacelike separation), two experimenters (Alice and Bob) 
randomly choose spatial axes ($a$ and $b$ respectively) along
which to measure the spins of their particles. The outcomes
of these measurements are bivalent, and we'll
use units in which the possible outcomes are $A = \pm 1$,
$B = \pm 1$.

Bell's derivation of the inequality gets started as fol- 
lows: consider a theory (not necessarily OQM) according 
to which a complete description of the state of the particle
pair (on some hypersurface just prior to the measurements,
say) is denoted $\lambda$. Note that $\lambda$ could be merely the
QM wave function, or it could include more - or indeed
less - structure attributed to the particles. The mere notation
($\lambda$) commits us, really, to nothing. I stress this
point because there is an unfortunate tendency in the
Bell literature to call $\lambda$ "the hidden variable" in contexts
in which there is no reason or need to commit to this
(i.e., to commit to the claim that $\lambda$ attributes additional
structure to the pair's state, compared with OQM's wave 
function). This tendency has no doubt contributed to the 
confusion about whether, and how, some HV/CFD type 
assumption is present in the derivation of the inequal- 
ities. We will go beyond orthodoxy soon enough, but 
let's not fool ourselves into thinking that the mere idea 
of a complete state description (or an arbitrary choice of 
symbol for it) commits us to anything anti-orthodox. 

Following Bell, let us now assume that the outcomes
are determined [?] - i.e., that there exist functions

$$ A(a, b, \lambda) = \pm 1 \tag{1} $$

and

$$ B(a, b, \lambda) = \pm 1 \tag{2} $$

which give the outcomes of Alice's and Bob's experiments
under the relevant conditions. Let us then require locality
(i.e., each outcome is determined without influence
from the distant apparatus orientation), so the relevant
functions have the form

$$ A(a, \lambda) = \pm 1 \tag{3} $$

and

$$ B(b, \lambda) = \pm 1. \tag{4} $$

Now, clearly, we are no longer talking about OQM, and 
for (already) two distinct reasons. The first is determinism:
Bell assumes that for a given $\lambda$ and a given $a$ and $b$,
unique outcomes are determined. But OQM is not a deterministic
theory. Were we to insist on maintaining allegiance
to orthodox quantum philosophy, we should talk
only of probabilities for the various possible outcomes,
e.g., $P(A|a,b,\lambda)$. (As we will discuss later, the major
advance of the CHSH inequality relative to the original 
Bell inequality is that it does away with this assumption 
of determinism; but more on that later.) 

Bell also goes beyond orthodoxy in requiring locality. 
Orthodox QM simply does not respect Bell's locality condition
(elaborated in [1]). OQM is not a local theory.

To summarize, Bell has us consider a deterministic local
hidden variable (LHV) theory. "Deterministic" and
"local" are obvious; the theory being considered is a "hidden
variable" theory because the complete state description
($\lambda$) cannot merely be the quantum mechanical wave
function (which simply does not attribute enough struc- 
ture to the particles to uniquely and locally determine 
the outcomes of spin measurements!). 

Continuing with the functions $A(a, \lambda)$ and $B(b, \lambda)$, Bell
has us consider the expected value of the product of the
outcomes of Alice's and Bob's measurements:

$$ E(a, b) = \int d\lambda \, \rho(\lambda) \, A(a, \lambda) \, B(b, \lambda) \tag{5} $$

where $\rho(\lambda)$ is the probability density for the "singlet
state" pair preparation procedure to produce the state
$\lambda$.

Bell now imposes the perfect correlation requirement 
(a special case of the predictions of OQM): whenever Al- 
ice and Bob measure along the same axis, they will get 
opposite outcomes. Thus 

$$ A(b, \lambda) = -B(b, \lambda) \tag{6} $$

which allows us to rewrite the correlation function (5) as

$$ E(a, b) = -\int d\lambda \, \rho(\lambda) \, A(a, \lambda) \, A(b, \lambda). \tag{7} $$

This expression contains the first apparent seeds of an ad- 
ditional (third) anti-orthodox assumption in the deriva- 
tion. Taken straight, the meaning of this expression is 
something like: 

• the expected value for the product of 

(i) the outcome of Alice's measurement along a, 
and 

(ii) the outcome of Alice's measurement along b. 

The important point here is not that the expression im- 
plies the existence of a particular value for the outcome 
of either measurement. That aspect is justified by the
already-noted assumption of determinism. What's new
and crucial here is the (apparent) requirement that the
values $A(a, \lambda)$ and $A(b, \lambda)$ are simultaneously meaningful,
even though at most one of the measurements (along
$a$ or $b$) can actually be performed. The whole idea of
a meaningful correlation (or more precisely its expectation
value) between the outcomes of two incompatible
measurements surely goes beyond the verificationist/positivist
emphasis of the orthodox framework, in
particular Bohr's insistence that "the measuring instruments
... serve to define the conditions under which the
phenomena appear." [2, pg 2]

I find it helpful to think of this orthodox principle as 
asserting the "contextuality" of measurement outcomes: 
because (according to the orthodox view) the outcomes 
are not pre-encoded in the measured object, but only 
arise in the interaction of the object with the measuring 
apparatus, it is meaningless to refer to the outcomes of 
merely hypothetical measurements. "Unperformed experiments
have no results." Or: "No elementary phenomenon
is a phenomenon until it is a registered (observed)
phenomenon." [?]

So in addition to the determinism and locality assump- 
tions already noted, we have here a third assumption 
that might be called "non-contextuality" or "counter- 
factual meaningfulness" (CFM - the idea being that it is 
meaningful to talk about the outcomes of not-actually-performed
measurements).

It is rather difficult, however, to separate this third assumption
from the first-noted assumption (determinism).
The contrast here would be a theory that is anti-orthodox
in (say) the first two ways (it is deterministic, and it is local)
but which is nevertheless orthodox in the third sense:
according to this hypothetical theory, the outcomes of
measurements are brought into being (deterministically
and locally) by an interaction with the measurement apparatus,
so while it is perfectly meaningful to speak of
$A(a, \lambda)$ (in a situation where a measurement along $a$ is
actually occurring) or of $A(b, \lambda)$ (in a situation where a
measurement along $b$ is actually occurring), it would be
impossible to speak meaningfully of the product

$$ A(a, \lambda) \, A(b, \lambda) \tag{8} $$

since the measurement apparatuses needed to make (respectively)
the first and second factors meaningful are
incompatible. So there can be no situation in which the
product is meaningful.

Actually, the last few paragraphs have been deliberately
(but justifiably) misleading. This third anti-orthodox
assumption (CFM or non-contextuality) is actually
not implied by the mere writing of Equation (7) -
a point that is clear if we just remember where Equation
(7) came from. Recall that the apparently problematic
expression came from the combination of the unproblematic
definition of the correlation function - Equation (5)
- with the perfect correlation condition - Equation (6).
Remembering this allows us to give meaning to the correlation
as expressed in Equation (7) without making any
CFM assumption, simply by reading the factors as

(i) $A(a, \lambda)$ = the value the theory predicts for $A$ when
Alice measures along $a$ (with the pair in state $\lambda$),
and

(ii) $A(b, \lambda)$ = the opposite of the value the theory predicts
for $B$ when Bob measures along $b$ (with the
pair in state $\lambda$).

And since the two relevant measurements here (Alice's 
along a and Bob's along b) are perfectly compatible, we 
need not assume the meaningfulness of any measurement 
outcomes which are not (and indeed cannot be) actually 
performed. 

By parsing it this way, we see that this third sort
of anti-orthodoxy is not actually present in Equation
(7). But our reason for going into this is that this anti-orthodox
sort of CFM is assumed in the next step in
Bell's derivation, and in a way that cannot be eliminated
by any re-parsing of the problematic mathematical expressions.
Let us see how this comes about by following
through with Bell's derivation. Consider the difference
in correlation functions, expressed as in Equation (7),
for two different pairs of angles:

$$ E(a, b) - E(a, c) = \int d\lambda \, \rho(\lambda) \left[ A(a, \lambda) A(c, \lambda) - A(a, \lambda) A(b, \lambda) \right]. \tag{9} $$



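So far so good. But now we insert unity, into the first
term in the square brackets, in the form

$$ 1 = A(b, \lambda) \, A(b, \lambda) \tag{10} $$

(justified by the idea that $A(b, \lambda) = \pm 1$ so that, either
way, its square is one). This leaves us with

$$ E(a, b) - E(a, c) = \int d\lambda \, \rho(\lambda) \left[ A(a, \lambda) A(b, \lambda) A(b, \lambda) A(c, \lambda) - A(a, \lambda) A(b, \lambda) \right] \tag{11} $$

$$ = \int d\lambda \, \rho(\lambda) \, A(a, \lambda) A(b, \lambda) \left[ A(b, \lambda) A(c, \lambda) - 1 \right]. \tag{12} $$

From here it is straightforward to get Bell's inequality.
Taking absolute values on both sides, using the fact that
$|A| \le 1$, and identifying

$$ -\int d\lambda \, \rho(\lambda) \, A(b, \lambda) A(c, \lambda) = E(b, c) \tag{13} $$

gives Bell's inequality:

$$ |E(a, b) - E(a, c)| \le 1 + E(b, c). \tag{14} $$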
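As a rough numerical illustration of Equations (5), (6) and (14), the sketch below evaluates the correlation function for one toy deterministic local model and compares it with the singlet-state correlation of orthodox QM. Everything specific here is an assumption made for illustration and is not taken from the text: the sign-of-cosine rule for the outcome, the uniform $\rho(\lambda)$ on $[0, 2\pi)$, the treatment of settings as angles in a plane, and the names `A`, `B`, `E_lhv`, `E_qm` and `bell_gap`. The QM correlation $-\cos(a - b)$ is the standard singlet result.

```python
import math

def A(angle, lam):
    """Toy deterministic local outcome (+1 or -1), fixed by the setting and lambda."""
    return 1 if math.cos(angle - lam) >= 0 else -1

def B(angle, lam):
    """Perfect anti-correlation as in Eq. (6): Bob's outcome is the opposite of Alice's."""
    return -A(angle, lam)

def E_lhv(a, b, n=20000):
    """Eq. (5) with rho(lambda) uniform on [0, 2*pi), approximated by a midpoint sum."""
    total = 0
    for k in range(n):
        lam = 2 * math.pi * (k + 0.5) / n
        total += A(a, lam) * B(b, lam)
    return total / n

def E_qm(a, b):
    """Standard singlet-state correlation of orthodox QM for settings at angles a, b."""
    return -math.cos(a - b)

def bell_gap(E, a, b, c):
    """Left side minus right side of Eq. (14); a clearly positive value is a violation."""
    return abs(E(a, b) - E(a, c)) - (1 + E(b, c))

a, b, c = 0.0, math.pi / 3, 2 * math.pi / 3
print("toy local model gap:", bell_gap(E_lhv, a, b, c))
print("QM singlet gap:     ", bell_gap(E_qm, a, b, c))
```

For these settings the toy model should sit at the bound (a gap of zero, up to discretization error), while the QM correlation exceeds it. Note that the code freely evaluates `A` at several settings for one and the same `lam`; that is exactly the feature of the derivation whose status the following paragraphs examine.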

The reader has probably already noticed the step to
which we wish to call attention. One of the terms in
Equation (11) reads

$$ \int d\lambda \, \rho(\lambda) \, A(a, \lambda) A(b, \lambda) A(b, \lambda) A(c, \lambda) \tag{15} $$

which, in words, evidently means

• the expected value for the product of

(i) the outcome of Alice's measurement along a, 

(ii) the outcome of Alice's measurement along b 
(squared), and 

(iii) the outcome of Alice's measurement along c. 

Now the same kind of analysis we gave before will go 
through, but this time without the escape hatch that 
saved us before - namely, the re-parsing of one of the 
two apparently-incompatible factors in terms of another,
actually-compatible measurement (namely Bob's). For 
even supposing we parse one of the group { (i), (ii), and 
(iii) } in terms of a measurement made by Bob, we are 
here left inevitably with two incompatible measurement 
contexts (e.g., a and b) for Alice. And so, from the point 
of view of any theory that respects this third orthodox 
principle, such an expression as Equation (15) is simply
meaningless - and therefore anything that comes after its 
appearance (such as Bell's inequality) is equally meaning- 
less. And hence Bell's inequality simply would not apply 
to the kind of theory we raised the possibility of before: 
one which is deterministic and local, but which accepts 
the orthodox principle of contextuality (i.e., which denies 
CFM). 

Summing up, it appears that Bell's derivation of the 
inequality is premised on three distinct assumptions: 

• determinism 

• locality 

• CFM (or non-contextuality) 

any of which might, in principle, be rejected in the face 
of empirical data contradicting the inequality. 

One point should be addressed here before finally mov- 
ing on to consider the CHSH inequality. Someone skep- 
tical of the previous discussion could raise the following 
objection: surely Alice needn't actually set up a measurement
apparatus oriented along $b$ in order to justify the
statement that $A(b, \lambda) A(b, \lambda) = 1$. Even a merely imagined
measurement along $b$ (even one which conflicts with
an actually performed measurement along, say, $a$) must
obey $A(b, \lambda) = \pm 1$, and hence $A(b, \lambda)^2 = 1$. Can't we
then parse the allegedly-objectionable factors in Equation
(15) in terms of such an imaginary measurement,
thus removing the need for any CFM-type assumption? 

The problem with (i.e., the answer to) this objection 
is that "imagination" provides far too much leeway. For 
example, couldn't we imagine that $A(b, \lambda) = +1$ ... and
then imagine that $A(b, \lambda) = -1$ ... so the product is
$-1$ rather than the required $+1$? Or, for that matter,
couldn't we imagine $A(b, \lambda) = 17\pi/\sqrt{2}$ or any other
value whose square is not $+1$? Of course, the objector will
want to say that the imaginations must be constrained by
the (local deterministic) theory, which must (by prior assumptions)
attribute either the value $+1$ or the value $-1$
to Alice's (real or imaginary) measurement along $b$. But
this just brings out the fundamental problem with the
objection. Such a theory need only attribute this value
to Alice's measurement for an experimental context in
which a measurement along $b$ actually occurs. There are
no constraints whatever on what value (if any) such a
theory must attribute to a measurement along the $b$ direction
in an experimental context in which Alice is actually
measuring along the $a$ direction. Indeed, in principle,
for a contextual theory, there can be no such attributed 
value, for the very idea being considered (the outcome of 
a measurement that is incompatible with another actu- 
ally performed measurement) is simply meaningless. 

And so the original claim - that Bell's derivation of the 
inequality requires not only locality and determinism, but 
also a CFM-type assumption - stands. 



III. CHSH 

The main difference between the Bell inequality and 
the CHSH inequality is that the latter does not require 
determinism. Instead of beginning with an assumption
that the outcomes $A$ and $B$ are fixed once the experimental
context ($a$, $b$) and particle-pair state ($\lambda$) are specified,
CHSH require only that a theory specify the probabilities
for the various possible outcomes. Thus, instead of functions
$A(a, \lambda)$ and $B(b, \lambda)$, we begin with two probability
functions:

$$ P(A|a, \lambda) \tag{16} $$

and

$$ P(B|b, \lambda). \tag{17} $$

Note that we have already imposed the locality condi- 
tion, whereby the probability (assigned by the theory in 
question to the various possible outcomes) depends only 
on facts which are locally accessible to the experiment in 
question. Thus, for example, the probability for different
outcomes $A$ depends only on $a$ and $\lambda$ - not on the
distant setting $b$ or the distant outcome $B$. (Note that it
is only because the state description $\lambda$ is assumed to be
a complete description, that the non-dependence of the 
probability of A on the distant outcome B follows from 
local causality.) 

For such a (local, but not necessarily deterministic)
theory, the expectation value of the product of the two
outcomes will be given by:

$$ E(a, b) = \int d\lambda \, \rho(\lambda) \sum_{A,B} A \, B \, P(A|a, \lambda) \, P(B|b, \lambda). \tag{18} $$

Since $A = \pm 1$ and $B = \pm 1$, this can be simplified to

$$ E(a, b) = \int d\lambda \, \rho(\lambda) \, \bar{A}(a, \lambda) \, \bar{B}(b, \lambda) \tag{19} $$

where

$$ \bar{A}(a, \lambda) = \sum_{A} A \, P(A|a, \lambda) = P(A = +1|a, \lambda) - P(A = -1|a, \lambda) \tag{20} $$

is the theory's prediction for the average value of Alice's
experiment (with the apparatus set along $a$ and with particles
in the state $\lambda$). (And similarly for $\bar{B}$.)
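The step from Equation (18) to Equations (19) and (20) can be checked numerically for a single value of $\lambda$. The sketch below does this for one hypothetical pair of local probability functions; the cosine form of `p_alice` and `p_bob`, and all function names, are assumptions chosen only so that the probabilities are valid and depend on local quantities alone.

```python
import math

def p_alice(A, a, lam):
    """Hypothetical local probability P(A | a, lambda) for outcome A = +1 or -1."""
    return 0.5 * (1 + A * math.cos(a - lam))

def p_bob(B, b, lam):
    """Hypothetical local probability P(B | b, lambda) for outcome B = +1 or -1."""
    return 0.5 * (1 - B * math.cos(b - lam))

def integrand_eq18(a, b, lam):
    """Sum over A and B of A * B * P(A|a,lambda) * P(B|b,lambda), as in Eq. (18)."""
    return sum(A * B * p_alice(A, a, lam) * p_bob(B, b, lam)
               for A in (+1, -1) for B in (+1, -1))

def mean_alice(a, lam):
    """Eq. (20): Alice's average outcome, P(A=+1) - P(A=-1)."""
    return p_alice(+1, a, lam) - p_alice(-1, a, lam)

def mean_bob(b, lam):
    """Bob's average outcome, defined in the same way as Alice's."""
    return p_bob(+1, b, lam) - p_bob(-1, b, lam)

def integrand_eq19(a, b, lam):
    """Product of the two local averages, as in the integrand of Eq. (19)."""
    return mean_alice(a, lam) * mean_bob(b, lam)

# The two integrands should agree for any settings and any lambda.
for a, b, lam in [(0.0, 1.0, 0.3), (0.7, 2.1, 4.0), (1.5, 1.5, 5.5)]:
    print(integrand_eq18(a, b, lam), integrand_eq19(a, b, lam))
```

The agreement relies only on the double sum factorizing into a sum over $A$ times a sum over $B$, which is what the locality condition on the probabilities guarantees.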

Thus parsing Equation (19): the expected value of the
product of Alice's and Bob's experiments is given by a 
weighted average (over all the pair states that could be 
produced by the preparation procedure) of the product 
of 

(i) the average value for Alice's measurement, and 

(ii) the average value for Bob's measurement 

where the averages here are averages over the possible 
outcomes allowed by the theory when the experimental 
context a and b is realized. 

Continuing with the derivation, let us consider, as before,
the difference in the correlation function for two
different pairs of angles:

$$ E(a, b) - E(a, b') = \int d\lambda \, \rho(\lambda) \left[ \bar{A}(a, \lambda) \bar{B}(b, \lambda) - \bar{A}(a, \lambda) \bar{B}(b', \lambda) \right] \tag{21} $$

where $b$ and $b'$ refer to two distinct settings of Bob's
apparatus (as will, likewise, $a$ and $a'$ refer shortly to two
distinct settings of Alice's apparatus). So far so good.
But now, to continue with the derivation, we need to
add zero inside the integrand in the clever form

$$ 0 = \pm \bar{A}(a, \lambda) \bar{A}(a', \lambda) \bar{B}(b, \lambda) \bar{B}(b', \lambda) \mp \bar{A}(a, \lambda) \bar{A}(a', \lambda) \bar{B}(b, \lambda) \bar{B}(b', \lambda). \tag{22} $$

Rearranging and factoring [9] inside the integrand then
gives

$$ E(a, b) - E(a, b') = \int d\lambda \, \rho(\lambda) \left[ \bar{A}(a, \lambda) \bar{B}(b, \lambda) \left( 1 \pm \bar{A}(a', \lambda) \bar{B}(b', \lambda) \right) - \bar{A}(a, \lambda) \bar{B}(b', \lambda) \left( 1 \pm \bar{A}(a', \lambda) \bar{B}(b, \lambda) \right) \right] \tag{23} $$

which is easily reduced, by taking absolute values on both
sides, to

$$ |E(a, b) - E(a, b')| + |E(a', b') + E(a', b)| \le 2 \tag{24} $$

which is the CHSH inequality.
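A quick numerical check of Equation (24) may help fix ideas. The sketch below evaluates the left-hand side for one hypothetical local stochastic model and for the standard singlet-state correlation of orthodox QM. The cosine form of the local averages, the uniform $\rho(\lambda)$ on $[0, 2\pi)$, the particular settings, and the names `mean_alice`, `mean_bob`, `E_local`, `E_qm` and `chsh` are all assumptions made for illustration.

```python
import math

def mean_alice(a, lam):
    """Hypothetical local average outcome for Alice; lies between -1 and +1."""
    return math.cos(a - lam)

def mean_bob(b, lam):
    """Hypothetical local average outcome for Bob; lies between -1 and +1."""
    return -math.cos(b - lam)

def E_local(a, b, n=2000):
    """Eq. (19) with rho(lambda) uniform on [0, 2*pi), approximated by a midpoint sum."""
    total = 0.0
    for k in range(n):
        lam = 2 * math.pi * (k + 0.5) / n
        total += mean_alice(a, lam) * mean_bob(b, lam)
    return total / n

def E_qm(a, b):
    """Standard singlet-state correlation of orthodox QM for settings at angles a, b."""
    return -math.cos(a - b)

def chsh(E, a, a2, b, b2):
    """Left-hand side of Eq. (24), with a2 and b2 standing for a' and b'."""
    return abs(E(a, b) - E(a, b2)) + abs(E(a2, b2) + E(a2, b))

a, a2, b, b2 = 0.0, math.pi / 2, math.pi / 4, 3 * math.pi / 4
print("local model:", chsh(E_local, a, a2, b, b2))
print("QM singlet: ", chsh(E_qm, a, a2, b, b2))
```

With these settings the local model stays below the bound of 2, while the QM correlation gives $2\sqrt{2}$. As in the derivation, the code evaluates `mean_alice` at both `a` and `a2` (and `mean_bob` at both `b` and `b2`) for the same `lam`, irrespective of which setting would actually be chosen in a given run.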

The reader has probably already noticed the step to
which we wish to draw attention. The two terms (which
add to zero) that appear for the first time in Equation
(23) have the form

$$ \int d\lambda \, \rho(\lambda) \, \bar{A}(a, \lambda) \, \bar{A}(a', \lambda) \, \bar{B}(b, \lambda) \, \bar{B}(b', \lambda) \tag{25} $$

which, in words, is evidently



• the expected value for the product of 

(i) the average value for the outcome of Alice's 
measurement along a, 

(ii) the average value for the outcome of Alice's 
measurement along a', 

(iii) the average value for the outcome of Bob's 
measurement along b, and 

(iv) the average value for the outcome of Bob's
measurement along b'.

which, like the similar terms discussed in the previous 
section, would only seem to be meaningful for a theory 
of non-contextual hidden variables. That is, for a the- 
ory according to which the experimental outcomes are 
brought into existence by some kind of interaction be- 
tween the measuring apparatus and the measured object, 
it would be meaningless to talk simultaneously about the 
outcomes of incompatible measurements (such as Alice's 
measurements along both a and a'). 

Of course, one might object, we are not here forced, 
by the algebra, to talk about specific outcomes for these 
pairs of incompatible measurements. Instead, we only 
need simultaneous talk about the averages of those pairs 
of incompatible measurements. This objection is correct 
as far as it goes: the assumption here is indeed weaker 
than the full "counter-factual definiteness" which is used
in the derivation of the original Bell inequality. But there 
is still here, in the CHSH derivation, an assumption of the 
weaker condition we have called CFM: counter-factual 
meaningfulness. This parallels closely the discussion in 
the previous section, so we need not elaborate in great 
detail. Suffice it to point out that locality alone (which is 
the only explicit assumption noted so far in the derivation 
of the CHSH inequality) does not (in any obvious way) 
warrant a claim that the average value for a measurement 
by Alice along $a'$ should be the same under two scenarios:
first, the measurement along $a'$ is actually performed,
and second, the measurement along $a'$ is performed while
the apparatus is oriented along $a$. Indeed, it is difficult to
understand how a theory could yield up any determinate
value for the average under the second set of conditions
since those conditions are in principle unrealizable - they
involve a contradiction, because the apparatus cannot be
simultaneously aligned along both $a$ and $a'$.

Of course, there is no such problem in a theory of non- 
contextual hidden variables - i.e., a theory in which the 
outcomes of experiments (or the probabilities for vari- 
ous possible outcomes) are pre-encoded in the state of 
the object alone, with the role of the experimental ap- 
paratus being simply to reveal those pre-existing values 
(or to reveal one of the possible values according to the 
pre-existing probability distribution that is, so to speak, 
encoded in the state of the "measured" object). In such 
a theory, we can understand the problematic expressions 
such as Equation (25) as referring, not to (averages of)
outcomes of actually-performed experiments, but to the
hidden variables themselves - to those features of the
state of the object which determine what the averages
will be if a measurement is performed (but with no im- 
plication that such a measurement need actually be per- 
formed in order to give meaning to the expression). 

We thus conclude that, in addition to the assumption 
of locality (which nobody denies is present), the deriva- 
tion of the CHSH inequality requires also a second as- 
sumption: not the "counter-factual definiteness" (CFD) 
that is needed to derive the Bell inequality, that is true, 
but a weaker condition of counter-factual meaningfulness
(CFM, which can be roughly thought of as "CFD
minus determinism"). CFM is the assumption that underwrites
simultaneous talk about (the averages of) outcomes
of incompatible experiments, and is practically
equivalent to the assumption of non-contextuality (which
is violated by both orthodox QM and contextual hidden
variable theories such as Bohmian Mechanics, both of
which are, however, non-local theories). Thus, the only
familiar examples of theories satisfying the two conditions
needed to derive the CHSH inequality would be
local, non-contextual hidden variable theories. We therefore
conclude that it is misleading (or flat wrong) to assert
that the CHSH inequality is an example of a Bell
inequality "without hidden variables." [10] It's true that
one need not explicitly assume hidden-variables in order 
to arrive at the inequality; but the CFM assumption one 
does need is, for all practical purposes, equivalent. 

IV. DISCUSSION 

We have identified three assumptions 

1. determinism 

2. locality 

3. counter-factual meaningfulness (CFM)

which function as premises in the derivation of the Bell 
inequality (which requires all three premises) and the 
CHSH inequality (which requires only the second and 
third). 

The status of the CHSH inequality relative to the Bell 
inequality has been obscured, in previous literature, by 
the packaging of our premises 1 and 3 into "counter- 
factual definiteness" (CFD) - the idea that there is some 
uniquely determined outcome to not-actually-performed 
measurements. This packaging has apparently led some 
people to the erroneous conclusion that, since the full 
CFD property is not assumed in the derivation of the 
CHSH inequality, it is based on no assumption other 
than locality. But this is false. One does still need the 
(weaker) CFM principle (and locality) to arrive at CHSH. 
And so, in principle [14], a violation of the CHSH inequality
could be blamed either on a failure of locality or a
failure of counter-factual meaningfulness (i.e., a failure of
non-contextuality).

In another nice paper [7] that argues for a Bell-type
inequality making "no mention ... of 'hidden variables'
or similar superstitions", Asher Peres characterizes the
logical status of the inequality as follows: "Let us as- 
sume that the outcome of an experiment performed on 
one of the systems is independent of the choice of the 
experiment performed on the other. Now, let us try to 
imagine the results of alternative measurements, which 
could have been performed on the same systems instead 
of the actual measurements. Then there is no way of con- 
triving these hypothetical results so that they will satisfy 
all the quantum correlations with the results of the ac- 
tual measurements." Peres also discusses how the deriva- 
tion "involves a comparison of the results of experiments 
which were actually performed, with those of hypothet- 
ical experiments which could have been performed but 
were not" and points out that "it is impossible to imag- 
ine the latter results in a way compatible with (a) the 
results of the actually performed experiments, (b) long 
range separability of results of individual measurements, 
and (c) [the empirical predictions of] quantum mechan- 
ics." 

What then should we infer from the fact that the in- 
equalities are empirically violated? Peres tells us that 
"[t]here are two possible attitudes in the face of these 
results. One is to say that it is illegitimate to spec- 
ulate about unperformed experiments. In brief 'Thou 
shalt not think.' .... Alternatively, for those who cannot 
refrain from thinking, we can abandon the assumption 
that the results of measurements by A[lice] are independent
of what is being done by B[ob]." These statements
nicely summarize the conclusions of the current essay. 
Unlike many other authors, Peres correctly character- 
izes the additional assumption (beyond Locality) needed 
to derive the CHSH inequality as weaker than "counter- 
factual definiteness" - it is, rather, the assumption that 
it is meaningful to speculate at all about unperformed 
experiments. It's not just that one shouldn't think of 
them as having definite outcomes; one cannot even think 
of them as having a well-defined average outcome. In 
short, following Peres, one "shalt not think" about the 
results of un-performed (and/or un-performable) experi- 
ments at all. 

A theory which maintains allegiance to this orthodox
principle ("Thou shalt not think" about un-performed
experiments) could make predictions that violate the
inequalities (and hence agree with experiment) even if
it were local. Or so it might seem. But before spending
any time trying to concoct such a theory (which, if
found, would be an explicit counterexample refuting the
claims of those - from "class (1)" mentioned in the introduction
- who think that the empirical violations of
Bell's inequalities prove that nature is non-local) one
would do well to remember the existence of the modified
EPR argument (as detailed in [?]). For this (unfortunately
neglected, if not completely forgotten) argument
proves that the various other principles needed to arrive
at a Bell-type inequality can all be derived from the assumption
of locality - that locality requires a local, non-contextual
hidden-variables theory (to explain the fact
of perfect anti-correlation when Alice and Bob measure
along the same axis). So when we bring this modified 
EPR argument back in, we see that, after all, there is 
no choice but to blame the empirical violations of Bell- 
type inequalities on the failure of the locality assumption. 
We cannot, after all, blame the Bell-violating data on a 
failure of CFM or any other principle going beyond or- 
thodoxy (other than locality). 

So our real conclusion is this: it is nice that we don't 
need to leave aside the modified EPR argument (as we 
have done throughout the current paper). Without that 
other argument (the first half of Bell's own two-part argument
for non-locality) we might accidentally fool ourselves
into thinking that the empirical data can be dealt with by 
rejecting something other than locality. But it can't - a 
realization which leaves us in a better position to appre- 
ciate Bell's statement that "For me then this is the real 
problem with quantum theory: the apparently essential 
conflict between any sharp formulation and fundamental 
relativity. That is to say, we have an apparent incompat- 
ibility, at the deepest level, between the two fundamental 
pillars of contemporary theory..." 0, pg 172] 



V. CONFESSION 

Aside from a few parenthetical qualifications snuck into 
the abstract and introduction, I have tried to present the 
arguments here as forcefully and univocally as possible. 
But I feel I should confess to being, myself, not at all 
convinced that these arguments are correct. The main 
thesis I've argued for is that certain mathematical expressions, 
which necessarily appear in the middle stages of 
the derivation of generalized Bell inequalities, are simply 
meaningless from the point of view of contextual theo- 
ries (i.e., theories rejecting CFM), since these expressions 
contain a kind of implicit reference to incompatible pairs 
of measurements. 

But is there really a new and separate CFM assump- 
tion here? For example, in the original Bell derivation, 
once we allow the existence of functions such as $A(\hat{a}, \lambda)$, 
haven't we already tacitly allowed that these functions 
are well-defined? That is, can't we simply regard $A(\hat{a}, \lambda)$ 
and $A(\hat{a}', \lambda)$ as "just the same functions ... with different 
argument" (as Bell puts it in answering a related 
objection in his paper "Locality in quantum mechanics: 
reply to critics" Q])? That is, don't the kind of hid- 
den variables that are already on the table, based on the 
determinism and locality assumptions, support simultaneous 
talk of $A(\hat{a}, \lambda)$ and $A(\hat{a}', \lambda)$ - as simply the values 
that the theory yields for outcomes along two possible 
(but not necessarily actual, and by no means "actually 
simultaneous") measurements? 

And then can't one argue in parallel for the average 
values that are used in the context of the CHSH deriva- 
tion, thus concluding that, indeed, the CHSH inequal- 
ity follows (in one step) from the locality assumption 
alone (just as the previously-mentioned anonymous referee 
claimed)? 

The arguments presented in the earlier sections of this 
paper have the following basic structure: certain mathe- 
matical expressions appearing in the intermediate stages 
of the algebraic derivation of Bell-type inequalities can 
be "translated back" into prose descriptions of certain 
correlation functions - i.e., expectation values for products 
of certain sets of measurement outcomes. See, for 
example. Equation ifT^ and the subsequent prose trans- 
lation. This "back translation" is supposed to have been 
justified by the fact that it is merely doing, in reverse, 
what we did originally to write down a mathematical expression 
- Equation (0) - for "the expected value of the 
product of Alice's measurement along $\hat{a}$ and Bob's measurement 
along $\hat{b}$". And then, according to our earlier 
argument, since the "back translated" prose statement is 
operationally meaningless (because it refers to the product 
of outcomes of incompatible measurements), so is the 
corresponding mathematical expression. 

But who says we need to - or, indeed, are entitled to 
- make this "back translation"? By the time we get to 
the relevant stage in the algebraic derivation, there is no 
question about the meaningfulness of the individual factors 
in (for example) Equation (15). Each is simply the 
outcome that the theory in question predicts for the mea- 
surement in question. And if each of those is individually 
meaningful, how in the world can there be any problem 
in multiplying them together (and then averaging the result 
over the possible states $\lambda$ that might be, according 
to the theory, produced by the preparation procedure)? 
Sure, if we translate the math back into prose in a certain 
way, we get something that is operationally meaningless. 
But we could just as easily - and, I would argue, more 
faithfully - translate the mathematical expression into 
prose this way: Equation (15) represents 

• the average (over possible $\lambda$s) of the product of 

(i) the value that the theory predicts for the outcome 
of a measurement by Alice along $\hat{a}$ 

(ii) the value that the theory predicts for the 
outcome of a measurement by Alice along $\hat{b}$ 
(squared), and 

(iii) the value that the theory predicts for the outcome 
of a measurement by Alice along $\hat{c}$. 

And this way there is no reference to the actual outcomes 
of actually-performed measurements that are incompat- 
ible, no tacit assumption of CFM. We are simply talk- 
ing about what some theory says will happen in various 
circumstances, and mathematically manipulating values 
that are perfectly well defined. There then seems to be no 
problem - no mysterious third anti-orthodox assumption 
- whatsoever. 
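
As a sketch of what this second reading amounts to in symbols, the quantity described by items (i)-(iii) can be written out explicitly. The notation here is an assumption made for illustration: $\rho(\lambda)$ stands for a normalized distribution over the states $\lambda$ produced by the preparation procedure, and each predicted outcome is taken to be $\pm 1$.

```latex
\int d\lambda \, \rho(\lambda) \,
  A(\hat{a},\lambda) \, \bigl[ A(\hat{b},\lambda) \bigr]^{2} \, A(\hat{c},\lambda)
\;=\;
\int d\lambda \, \rho(\lambda) \, A(\hat{a},\lambda) \, A(\hat{c},\lambda),
\qquad \text{since } \bigl[ A(\hat{b},\lambda) \bigr]^{2} = 1 .
```

On this reading every factor is a number that the theory assigns to a given $\lambda$, so the product and its average are defined whether or not the corresponding measurements could be performed together.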

I think the fundamental point here is one that has al- 
ready been emphasized, from a slightly different point 
of view, in P]. Confusion arises when one forgets (in 
the context of analyzing Bell's theorem) that one is not 
talking directly about measurement outcomes, but about 
theories and their predictions. (This confusion is prob- 
ably prevalent because of the influence of philosophical 
positivism on the quantum founding fathers and their fol- 
lowers.) In the context of the present paper, it seems that 
as long as one keeps this in mind - as long as one resists 
the temptation to require a direct operationalist inter- 
pretation for every intermediate stage in the algebraic 
derivation - it emerges that there is no distinct "CFM" 
assumption in the derivation of the Bell or CHSH inequal- 
ities. And hence it emerges that the CHSH inequality 
in particular follows from locality alone, such that its 
empirical violation can only be blamed on the non-local 
character of nature. 
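
The "theories and their predictions" reading can be illustrated with a minimal Python sketch. It is not part of the derivation discussed in this paper, and it assumes the deterministic special case: for each state the theory assigns definite values of +1 or -1 to Alice's outcomes along two settings and to Bob's outcomes along two settings. The names `chsh`, `values`, and `chsh_average` are hypothetical, introduced only for this illustration.

```python
from itertools import product


def chsh(A_a, A_ap, B_b, B_bp):
    """CHSH combination of the +/-1 values a theory predicts for one state.

    A_a, A_ap: predicted outcomes for Alice along settings a and a'.
    B_b, B_bp: predicted outcomes for Bob along settings b and b'.
    """
    return A_a * B_b + A_a * B_bp + A_ap * B_b - A_ap * B_bp


# Enumerate all 16 assignments of +/-1 to the four predicted values.
values = [chsh(*s) for s in product((+1, -1), repeat=4)]

print(sorted(set(values)))          # [-2, 2]
print(max(abs(v) for v in values))  # 2


def chsh_average(weights):
    """Average of the CHSH combination over the 16 assignments.

    weights: 16 non-negative numbers (not all zero), playing the role of
    an assumed distribution over states; they are normalized here.
    """
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total
```

Because every assignment gives exactly +2 or -2, any weighted average of them lies between -2 and +2, which is the CHSH bound. The sketch only manipulates numbers that a theory assigns to each state; whether that manipulation hides a CFM assumption is the question at issue in this section.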

Nevertheless (a meta-confession) I am not 100% certain 
that my own objection to my own arguments is correct. 
So, at least for the time being, I am relieved that the 
approach taken in exists - that is, the approach of us- 
ing a modified EPR argument as an "end run" around the 
question of CFM. The associated two-part argument for 
nonlocality still seems (to me) to be the most straightfor- 
ward, least subtle, and most airtight proof that locality 
(and, in this context, nothing else) has been empirically 
refuted. 